Abstract. In the 1960s, pioneers in artificial intelligence
made grand claims that AI systems would surpass
human intelligence before the end of the 20th century.
Except for beating the world chess champion in 1997,
none of the other predictions have come true. But AI
research has contributed a huge amount of valuable
technology, which has proved to be successful on
narrow, specialized problems. Unfortunately, the field of
AI has fragmented into those narrow specialties. Many
researchers claim that their specialty is the key to
solving all the problems. But the true key to AI is the
knowledge that there is no key. Human intelligence 
comprises every specialty that anyone in any culture or 
civilization has ever dreamed of. Each one is adequate 
for a narrow range of applications. The power of human 
intelligence comes from the ability to relate, combine, 
and build on an open-ended variety of methods for 
different applications. Successful AI systems require a
framework that can support any and all such 
combinations. 


1 Early Success and Later Disappointment


From 1945 to 1970, computer hardware, 
software, and theory developed rapidly. By the 
1960s, the prospects for artificial intelligence 
seemed unlimited. From a textbook on machine 
translation in 1960: “While a great deal remains 
to be done, it can be stated without hesitation that 
the essential has already been accomplished” [2]. 
In 1965, I. J. Good defined an ultraintelligent 
machine as a system that surpasses human 
ability in every field [4]. He predicted “It is more 
probable than not that, within the twentieth
century, an ultraintelligent machine will be built
and that it will be the last invention that man need 
make.” Marvin Minsky was a technical adviser for 
the AI features of the HAL 9000 computer in the
movie 2001: A Space Odyssey. When the movie
opened in 1968, he claimed that it was a
“conservative estimate” of AI technology in 2001.

Those claims seemed reasonable at the time. 
By the early 1960s, research on AI had produced
an impressive body of results. Many of them were 
documented in the book Computers and 
Thought [3]: 


— In the first paper, Alan Turing asked the 
fundamental question of AI, “Can a machine
think?” As a test for thinking, he proposed an 
imitation game: if people cannot distinguish a 
computer’s responses in a dialog from human 
responses, then they should consider the 
computer to be thinking at a human level [21]. 
Today’s consensus is that pattern-matching 
methods enable a computer to imitate human 
responses in a short conversation, but not in 
complex interactions. 


— Arthur Samuel invented machine learning 
methods that are still widely used today. He 
demonstrated their power in a checker- 
playing program that learned to play the game 
better than its designer. It even beat the 
Connecticut state champion [17]. 


— Other projects developed methods for 
answering questions in English, proving 
theorems in geometry, solving calculus 
problems, and managing an investment
portfolio. With many variations and
extensions, the methods are still the basis for
the AI toolkit today: list processing, pattern
matching, grammars, logic, heuristics,
statistics, and neural networks.


— But some papers made questionable claims 
about simulating human thought, concept 
formation, or language understanding. As in 
the Turing test, a successful imitation of short 
examples is not convincing evidence of 
deeper thought or understanding. 


For machine translation (MT), the technology 
of the 1960s produced useful results [8]. At the 
1964 World’s Fair, IBM demonstrated a system 
that translated Russian to English and printed 
both languages with the interchangeable 
bouncing balls of their Selectric typewriters. A 
competing system, the Georgetown University 
Automatic Translator (GAT), was widely used for 
translating physics research from Russian to 
English [6]. In the 1970s, an upgraded version of 
GAT was commercialized as SYSTRAN, which is 
still a widely used MT system. Today, free 
translators are available for any page on the 
WWW. But by the criterion of Fully Automatic 
High-Quality Translation (FAHQT), professional 
human translation is still far superior. 

The predictions by Good and Minsky were 
wrong because the time scale they considered 
was much too short. The exponential growth in 
hardware speed and capacity enabled a 
supercomputer to beat the world chess champion 
in 1997 [7]. But improvements in software theory 
and practice did not keep pace. From a historical 
perspective, the seemingly rapid growth in early 
computer science took advantage of many 
centuries of prior research: 


— Aristotle established the foundations for 
formal logic, ontology, and cognitive science. 
His theory of language was actively debated 
for centuries, and modern linguists have 
adopted aspects of his ontology [1]. 
Philosophers claim that his psychology 
provides better insights than 20th century 
behaviorism [15]. Even today, the RDF(S) 
notation for the Semantic Web does not use 
any logic that goes beyond Aristotle’s 
syllogisms. OWL is more expressive, but 
many of the published ontologies use only the 
Aristotelian subset [19].


— Diagrams and mechanical aids for reasoning
and computation have been used since 
antiquity. Euclid drew elaborate diagrams as 
essential components of proofs. In the 3rd century
AD, Porphyry included a tree diagram in his
introduction to Aristotle’s categories. In the
14th century, Ramon Lull combined the tree of
Porphyry with rotating circles as a method for 
relating categories to generate new ones. 
After studying Lull’s rotating circles, Leibniz 
developed an equivalent numeric 
method: map primitive concepts to prime 
numbers; multiply the primes to represent 
compound concepts; and use division to test 
whether one concept subsumes another. To 
automate the arithmetic, he invented the first 
mechanical calculator for multiplication and 
division. Leibniz’s method inspired Gödel to 
map all logical formulas to products of primes 
for the proof of his famous theorem. Leibniz 
also invented binary arithmetic, which Boole 
adopted for his logic and which Turing 
adopted for his machines, both abstract and 
electronic. 


— Medieval logic put more emphasis on 
semantics than syntax. In his Summa 
Logicae, William of Ockham developed a 
model-theoretic semantics for a subset of 
Latin. He combined Aristotle’s logic, the 
propositional logic of the Stoics, and even a 
subset of modal and temporal logic. Among 
the logicians who studied that logic were 
Bolzano, Brentano, Peirce, and the Polish 
school, which included Tarski. In 1887, Peirce 
published an article “Logical Machines” in the
American Journal of Psychology [16]. He 
described Babbage’s mechanical computers 
and machines for proving theorems in 
Boolean algebra. That paper was included in 
the bibliography of Computers and Thought. 
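
Leibniz's numeric method, described in the list above, can be illustrated with a few lines of code. The following is only a minimal sketch: the concept names and the particular prime assignments are hypothetical choices made for this example, not taken from Leibniz's own tables.

```python
from math import prod

# Hypothetical primitive concepts, each mapped to a distinct prime number.
PRIMES = {"animal": 2, "rational": 3, "winged": 5}

def encode(*features):
    """Represent a compound concept as the product of its primitives' primes."""
    return prod(PRIMES[f] for f in set(features))

def subsumes(general, specific):
    """A general concept subsumes a specific one if its number divides it evenly."""
    return specific % general == 0

animal = encode("animal")             # 2
human = encode("animal", "rational")  # 2 * 3 = 6
bird = encode("animal", "winged")     # 2 * 5 = 10

print(subsumes(animal, human))  # True: 6 is divisible by 2
print(subsumes(human, bird))    # False: 10 is not divisible by 6
```

Because every integer has a unique prime factorization, the division test succeeds exactly when every primitive of the general concept also occurs in the specific one.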


By 1980, the legacy of the previous centuries 
had been translated to a computable form. 
Applications of expert systems, pattern 
recognition, logic programming, and natural 
language processing (NLP) showed promise of 
great things to come. In 1982, the Japanese 
launched the Fifth Generation project based on AI
software and massively parallel hardware. But by
the end of the 1980s, AI software did not scale to
larger applications, and special-purpose hardware
was less cost-efficient than the mass-produced
microprocessors. The early predictions seemed
unlikely, research funding dried up, and the field
went into an “AI winter” [5].


2 Future Directions 


Since the 1990s, the huge volumes of data on the
Internet have made AI methods of deep reasoning and
language analysis impractical. The research
shifted to shallow statistical methods for 
information retrieval, data mining, and machine 
translation. The two most impressive successes 
for deeper methods used supercomputers. In 
1997, the Deep Blue system beat Garry Kasparov,
the world chess champion [7]. In 2011, the 
Watson system beat two of the best Jeopardy! 
contestants [9,12]. But cynics claimed that the 
main purpose of those projects was advertising 
for IBM computers. 


Whatever the motivation, the chess system 
demonstrated the importance of hardware speed 
and capacity. But it did little to advance AI
research. The Watson system, however, showed 
how a combination of language analysis, 
reasoning methods, machine learning, and large 
volumes of data could match human performance 
on challenging problems. With further 
improvements in hardware and software, a 
version of Watson running on more economical 
servers is being used to diagnose cancer and 
other diseases [10]. It doesn’t replace physicians, 
but it gives them better advice and more focused 
information than search engines. 


Despite some impressive applications, no AI
system today can learn, understand, and use 
language as quickly and accurately as a 3-year- 
old child. Automated telephone systems are 
useless when the caller strays from a 
preprogrammed script. Computer help facilities 
are useless when the user doesn’t know or can’t 
remember the exact name of the command, 
feature, or menu item. To be successful, AI
systems don’t have to be as intelligent as the HAL
9000. But they need to be flexible, adaptable,
helpful, and able to communicate in the user’s
native language. For specialized applications,
they should be at least as advanced as Watson. 
But they should be able to learn those 
applications by reading books and asking 
questions. Whether they pass the Turing test is 
irrelevant. 


The AI technology developed in the past 60
years is sufficient to support such systems. No
major breakthroughs were necessary to
implement Watson. It was assembled in a few
years by putting together readily available AI
components. Its English parser, for example, is 
over 20 years old [12]. A special-purpose pattern 
matcher was designed for Watson, but it turned 
out to be slower and less general than the Prolog 
language, which is over 30 years old [9]. But 
Watson required a great deal of applied research 
to tailor the components, make them work 
together, and test the many combinations on 
typical Jeopardy! questions. Unfortunately, few 
programmers and system analysts have the 
expertise to design and maintain such systems. 


The great strength of AI technology is its
coverage of nearly every aspect of intelligence. 
But its great weakness is fragmentation. 
Researchers who specialize in any area try to 
make their favorite set of tools do everything. 
Logicians combine formal logics with formal 
ontologies, formal grammars, and formal methods 
for mapping one to another. Specialists in neural 
networks try to solve every problem by combining 
multiple networks to form deeper networks. The 
strength of Watson is that it combines multiple 
modules based on different paradigms. But those 
modules are lashed together with procedural 
code. In the book Society of Mind, Minsky 
proposed a “society” of active processes as a way 
of managing that diversity [13]: 


What magical trick makes us intelligent? The 
trick is that there is no trick. The power of 
intelligence stems from our vast diversity, not 
from any single, perfect principle. Our species has 
evolved many effective although imperfect 
methods, and each of us individually develops 
more on our own. Eventually, very few of our 
actions and decisions come to depend on any 
single mechanism. Instead, they emerge from 
conflicts and negotiations among societies of 
processes that constantly challenge one another.
(§30.8) 


This view is radically different from the 
assumption of a unified formal logic that cannot 
tolerate a single inconsistency. Minsky’s goal is to 
build a flexible, fault-tolerant system. To provide 
the motivation that drives an intelligent system, 
Minsky extended his Society of Mind with an 
Emotion Engine [14]. But much more detail is 
needed to specify how the processes can and 
should interact in an efficient computer 
implementation. 


As an architecture that can support a society 
of interacting agents, Sowa designed the Flexible 
Modular Framework (FMF), which enables 
heterogeneous processes to interact by passing 
messages in various languages [18]. Majumdar 
used the FMF to support a hierarchy of agents 
that behave like the managers and employees of 
a business [11]. The chief executive officer (CEO) 
gives the organization a coherent “personality” for 
external interactions. Beneath the CEO are vice 
presidents in charge of major divisions, directors 
of important functions, lower-level managers, and 
specialists that perform an open-ended variety of 
cognitive tasks. For an overview of these and 
other promising designs, see the article “Future 
directions in semantic systems” [20]. 
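
The idea of a hierarchy of agents that exchange messages in various languages can be made concrete with a small sketch. The code below is not the FMF protocol or Majumdar's implementation; the class names, the `language` tag, and the delegation rule are all hypothetical, chosen only to show how a top-level agent might route a message to whichever specialist can process it.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    language: str   # hypothetical tag naming the language of the content
    content: str

@dataclass
class Agent:
    name: str
    languages: set = field(default_factory=set)       # languages handled directly
    subordinates: list = field(default_factory=list)  # agents this one manages

    def handle(self, msg):
        """Answer the message directly, or delegate it down the hierarchy."""
        if msg.language in self.languages:
            return f"{self.name} handled {msg.content!r}"
        for sub in self.subordinates:
            reply = sub.handle(msg)
            if reply is not None:
                return reply
        return None  # no agent under this one can process the message

# Hypothetical specialists, a manager, and a CEO at the top.
parser = Agent("parser", {"english"})
prover = Agent("prover", {"logic"})
manager = Agent("manager", subordinates=[parser, prover])
ceo = Agent("ceo", subordinates=[manager])

print(ceo.handle(Message("logic", "p -> q")))
print(ceo.handle(Message("prolog", "foo(X).")))
```

External callers talk only to the top-level agent, which mirrors the role of the CEO in presenting a single coherent interface. A message in a language that no specialist handles yields `None` rather than an error, so new specialists can be added without changing the agents above them.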